<!DOCTYPE html>

<html>
  <head>
    <meta charset="utf-8">
    
    <title>numpy.histogram_bin_edges &mdash; NumPy v1.18 Manual</title>
    
    <link rel="stylesheet" type="text/css" href="../../_static/css/spc-bootstrap.css">
    <link rel="stylesheet" type="text/css" href="../../_static/css/spc-extend.css">
    <link rel="stylesheet" href="../../_static/scipy.css" type="text/css" >
    <link rel="stylesheet" href="../../_static/pygments.css" type="text/css" >
    <link rel="stylesheet" href="../../_static/graphviz.css" type="text/css" >
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../../',
        VERSION:     '1.18.1',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  false
      };
    </script>
    <script type="text/javascript" src="../../_static/jquery.js"></script>
    <script type="text/javascript" src="../../_static/underscore.js"></script>
    <script type="text/javascript" src="../../_static/doctools.js"></script>
    <script type="text/javascript" src="../../_static/language_data.js"></script>
    <script type="text/javascript" src="../../_static/js/copybutton.js"></script>
    <link rel="author" title="About these documents" href="../../about.html" >
    <link rel="index" title="Index" href="../../genindex.html" >
    <link rel="search" title="Search" href="../../search.html" >
    <link rel="top" title="NumPy v1.18 Manual" href="../../index.html" >
    <link rel="up" title="Statistics" href="../routines.statistics.html" >
    <link rel="next" title="numpy.digitize" href="numpy.digitize.html" >
    <link rel="prev" title="numpy.bincount" href="numpy.bincount.html" > 
  </head>
  <body>
<div class="container">
  <div class="top-scipy-org-logo-header" style="background-color: #a2bae8;">
    <a href="../../index.html">
      <img border=0 alt="NumPy" src="../../_static/numpy_logo.png"></a>
  </div>
</div>


    <div class="container">
      <div class="main">
        
	<div class="row-fluid">
	  <div class="span12">
	    <div class="spc-navbar">
              
    <ul class="nav nav-pills pull-left">
        <li class="active"><a href="https://numpy.org/">NumPy.org</a></li>
        <li class="active"><a href="https://numpy.org/doc">Docs</a></li>
        
        <li class="active"><a href="../../index.html">NumPy v1.18 Manual</a></li>
        

          <li class="active"><a href="../index.html" >NumPy Reference</a></li>
          <li class="active"><a href="../routines.html" >Routines</a></li>
          <li class="active"><a href="../routines.statistics.html" accesskey="U">Statistics</a></li> 
    </ul>
              
              
    <ul class="nav nav-pills pull-right">
      <li class="active">
        <a href="../../genindex.html" title="General Index"
           accesskey="I">index</a>
      </li>
      <li class="active">
        <a href="numpy.digitize.html" title="numpy.digitize"
           accesskey="N">next</a>
      </li>
      <li class="active">
        <a href="numpy.bincount.html" title="numpy.bincount"
           accesskey="P">previous</a>
      </li>
    </ul>
              
	    </div>
	  </div>
	</div>
        

	<div class="row-fluid">
      <div class="spc-rightsidebar span3">
        <div class="sphinxsidebarwrapper">
  <h4>Previous topic</h4>
  <p class="topless"><a href="numpy.bincount.html"
                        title="previous chapter">numpy.bincount</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="numpy.digitize.html"
                        title="next chapter">numpy.digitize</a></p>
<div id="searchbox" style="display: none" role="search">
  <h4>Quick search</h4>
    <div>
    <form class="search" action="../../search.html" method="get">
      <input type="text" style="width: inherit;" name="q" />
      <input type="submit" value="search" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    </div>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
          <div class="span9">
            
        <div class="bodywrapper">
          <div class="body" id="spc-section-body">
            
  <div class="section" id="numpy-histogram-bin-edges">
<h1>numpy.histogram_bin_edges<a class="headerlink" href="#numpy-histogram-bin-edges" title="Permalink to this headline">¶</a></h1>
<dl class="function">
<dt id="numpy.histogram_bin_edges">
<code class="sig-prename descclassname">numpy.</code><code class="sig-name descname">histogram_bin_edges</code><span class="sig-paren">(</span><em class="sig-param">a</em>, <em class="sig-param">bins=10</em>, <em class="sig-param">range=None</em>, <em class="sig-param">weights=None</em><span class="sig-paren">)</span><a class="reference external" href="https://github.com/numpy/numpy/blob/v1.18.1/numpy/lib/histograms.py#L473-L672"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#numpy.histogram_bin_edges" title="Permalink to this definition">¶</a></dt>
<dd><p>Calculate only the bin edges used by the <a class="reference internal" href="numpy.histogram.html#numpy.histogram" title="numpy.histogram"><code class="xref py py-obj docutils literal notranslate"><span class="pre">histogram</span></code></a>
function.</p>
<dl class="field-list">
<dt class="field-odd">Parameters</dt>
<dd class="field-odd"><dl>
<dt><strong>a</strong><span class="classifier">array_like</span></dt><dd><p>Input data. The histogram is computed over the flattened array.</p>
</dd>
<dt><strong>bins</strong><span class="classifier">int or sequence of scalars or str, optional</span></dt><dd><p>If <em class="xref py py-obj">bins</em> is an int, it defines the number of equal-width
bins in the given range (10, by default). If <em class="xref py py-obj">bins</em> is a
sequence, it defines the bin edges, including the rightmost
edge, allowing for non-uniform bin widths.</p>
<p>If <em class="xref py py-obj">bins</em> is a string from the list below, <a class="reference internal" href="#numpy.histogram_bin_edges" title="numpy.histogram_bin_edges"><code class="xref py py-obj docutils literal notranslate"><span class="pre">histogram_bin_edges</span></code></a> will use
the method chosen to calculate the optimal bin width and
consequently the number of bins (see <em class="xref py py-obj">Notes</em> for more detail on
the estimators) from the data that falls within the requested
range. While the bin width will be optimal for the actual data
in the range, the number of bins will be computed to fill the
entire range, including the empty portions. For visualisation,
using the ‘auto’ option is suggested. Weighted data is not
supported for automated bin size selection.</p>
<dl class="simple">
<dt>‘auto’</dt><dd><p>Maximum of the ‘sturges’ and ‘fd’ estimators. Provides good
all-around performance.</p>
</dd>
<dt>‘fd’ (Freedman-Diaconis estimator)</dt><dd><p>Robust (resilient to outliers) estimator that takes into
account data variability and data size.</p>
</dd>
<dt>‘doane’</dt><dd><p>An improved version of Sturges’ estimator that works better
with non-normal datasets.</p>
</dd>
<dt>‘scott’</dt><dd><p>Less robust estimator that takes into account data
variability and data size.</p>
</dd>
<dt>‘stone’</dt><dd><p>Estimator based on leave-one-out cross-validation estimate of
the integrated squared error. Can be regarded as a generalization
of Scott’s rule.</p>
</dd>
<dt>‘rice’</dt><dd><p>Estimator that takes only data size into account, not
variability. Commonly overestimates the number of bins required.</p>
</dd>
<dt>‘sturges’</dt><dd><p>R’s default method; accounts only for data size. Optimal
only for Gaussian data, and underestimates the number of bins
for large non-Gaussian datasets.</p>
</dd>
<dt>‘sqrt’</dt><dd><p>Square root (of data size) estimator, used by Excel and
other programs for its speed and simplicity.</p>
</dd>
</dl>
</dd>
<dt><strong>range</strong><span class="classifier">(float, float), optional</span></dt><dd><p>The lower and upper range of the bins.  If not provided, range
is simply <code class="docutils literal notranslate"><span class="pre">(a.min(),</span> <span class="pre">a.max())</span></code>.  Values outside the range are
ignored. The first element of the range must be less than or
equal to the second. <em class="xref py py-obj">range</em> affects the automatic bin
computation as well. While bin width is computed to be optimal
based on the actual data within <em class="xref py py-obj">range</em>, the bin count will fill
the entire range including portions containing no data.</p>
</dd>
<dt><strong>weights</strong><span class="classifier">array_like, optional</span></dt><dd><p>An array of weights, of the same shape as <em class="xref py py-obj">a</em>.  Each value in
<em class="xref py py-obj">a</em> only contributes its associated weight towards the bin count
(instead of 1). This is currently not used by any of the bin estimators,
but may be in the future.</p>
</dd>
</dl>
</dd>
<dt class="field-even">Returns</dt>
<dd class="field-even"><dl class="simple">
<dt><strong>bin_edges</strong><span class="classifier">array of dtype float</span></dt><dd><p>The edges to pass into <a class="reference internal" href="numpy.histogram.html#numpy.histogram" title="numpy.histogram"><code class="xref py py-obj docutils literal notranslate"><span class="pre">histogram</span></code></a>.</p>
</dd>
</dl>
</dd>
</dl>
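<p>As a rough illustration of how <code class="docutils literal notranslate"><span class="pre">range</span></code> interacts with automatic bin selection, the sketch below compares the default range with a wider one. The sample data and the variable names (<code class="docutils literal notranslate"><span class="pre">data</span></code>, <code class="docutils literal notranslate"><span class="pre">edges_default</span></code>, <code class="docutils literal notranslate"><span class="pre">edges_wide</span></code>) are assumptions chosen for the example, not part of the API.</p>

```python
import numpy as np

# 50 evenly spaced samples confined to [0, 1] (illustrative data).
data = np.linspace(0.0, 1.0, 50)

# Default range is (data.min(), data.max()), so the edges span [0, 1].
edges_default = np.histogram_bin_edges(data, bins='sturges')

# Wider range: the bin width is still estimated from the data in range,
# but enough bins are created to cover all of [0, 10].
edges_wide = np.histogram_bin_edges(data, bins='sturges', range=(0.0, 10.0))

print(len(edges_default) - 1)          # number of bins over [0, 1]
print(len(edges_wide) - 1)             # number of bins over [0, 10]
print(edges_wide[0], edges_wide[-1])   # outer edges follow the requested range
```

<p>With this data the default range yields 7 bins and the wider range yields 67, most of which are empty because the data only occupies the first tenth of the range.</p>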
<div class="admonition seealso">
<p class="admonition-title">See also</p>
<p><a class="reference internal" href="numpy.histogram.html#numpy.histogram" title="numpy.histogram"><code class="xref py py-obj docutils literal notranslate"><span class="pre">histogram</span></code></a></p>
</div>
<p class="rubric">Notes</p>
<p>The methods to estimate the optimal number of bins are well founded
in the literature, and are inspired by the choices R provides for
histogram visualisation. Note that having the number of bins
proportional to <img class="math" src="../../_images/math/f23dca99aa152271eaf36c79b2f6585d5cc2df51.svg" alt="n^{1/3}"/> is asymptotically optimal, which is
why it appears in most estimators. These are simply plug-in methods
that give good starting points for the number of bins. In the equations
below, <img class="math" src="../../_images/math/4c120f773ab4e1c59ad2bd44aae969ce24dd190a.svg" alt="h"/> is the bin width and <img class="math" src="../../_images/math/a6964690d4ca72cbb84d94823c4ae6ed79783cf2.svg" alt="n_h"/> is the number of
bins. All estimators that compute bin counts are recast to bin width
using the <a class="reference internal" href="numpy.ptp.html#numpy.ptp" title="numpy.ptp"><code class="xref py py-obj docutils literal notranslate"><span class="pre">ptp</span></code></a> of the data. The final bin count is obtained from
<code class="docutils literal notranslate"><span class="pre">np.round(np.ceil(range</span> <span class="pre">/</span> <span class="pre">h))</span></code>.</p>
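<p>The recasting from a bin count to a bin width and back can be sketched by hand for the Sturges rule. This is a minimal illustration, not the library's internal code; the names <code class="docutils literal notranslate"><span class="pre">n_h</span></code>, <code class="docutils literal notranslate"><span class="pre">h</span></code>, <code class="docutils literal notranslate"><span class="pre">n_bins</span></code> and <code class="docutils literal notranslate"><span class="pre">manual_edges</span></code> are assumptions made for the example.</p>

```python
import numpy as np

arr = np.array([0, 0, 0, 1, 2, 3, 3, 4, 5])

# Sturges gives a bin count, n_h = log2(n) + 1 ...
n_h = np.log2(arr.size) + 1
# ... which is recast to a bin width using the peak-to-peak range of the data.
data_range = np.ptp(arr)
h = data_range / n_h

# Final bin count: round the ratio up to a whole number of bins.
n_bins = int(np.ceil(data_range / h))

# Equal-width edges over [min, max]; n_bins bins need n_bins + 1 edges.
manual_edges = np.linspace(arr.min(), arr.max(), n_bins + 1)

print(n_bins)
print(manual_edges)
print(np.histogram_bin_edges(arr, bins='sturges'))
```

<p>For this array the estimated count of about 4.17 is rounded up to 5 bins, so the edges fall on the integers 0 through 5.</p>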
<dl>
<dt>‘auto’ (maximum of the ‘sturges’ and ‘fd’ estimators)</dt><dd><p>A compromise to get a good value. For small datasets the Sturges
value will usually be chosen, while larger datasets will usually
default to FD.  Avoids the overly conservative behaviour of FD
and Sturges for small and large datasets respectively.
Switchover point is usually <img class="math" src="../../_images/math/c641f9333f8d3a70a4c3a71820e8cc174ada7cb8.svg" alt="a.size \approx 1000"/>.</p>
</dd>
<dt>‘fd’ (Freedman-Diaconis estimator)</dt><dd><div class="math">
<p><img src="../../_images/math/6b1bf2cb787a85208b8a8e37617008829f78fcb9.svg" alt="h = 2 \frac{IQR}{n^{1/3}}"/></p>
</div><p>The bin width is proportional to the interquartile range (IQR)
and inversely proportional to the cube root of <code class="docutils literal notranslate"><span class="pre">a.size</span></code>. Can be too
conservative for small datasets, but is quite good for large
datasets. The IQR is very robust to outliers.</p>
</dd>
<dt>‘scott’</dt><dd><div class="math">
<p><img src="../../_images/math/0333976256e58040469149d815088741323ebbb6.svg" alt="h = \sigma \sqrt[3]{\frac{24 * \sqrt{\pi}}{n}}"/></p>
</div><p>The bin width is proportional to the standard deviation of the
data and inversely proportional to the cube root of <code class="docutils literal notranslate"><span class="pre">a.size</span></code>. Can
be too conservative for small datasets, but is quite good for
large datasets. The standard deviation is not very robust to
outliers. Values are very similar to the Freedman-Diaconis
estimator in the absence of outliers.</p>
</dd>
<dt>‘rice’</dt><dd><div class="math">
<p><img src="../../_images/math/78f7fcc630bb8bd520b0591732ec2a1a660bfe63.svg" alt="n_h = 2n^{1/3}"/></p>
</div><p>The number of bins is proportional only to the cube root of
<code class="docutils literal notranslate"><span class="pre">a.size</span></code>. It tends to overestimate the number of bins and it
does not take into account data variability.</p>
</dd>
<dt>‘sturges’</dt><dd><div class="math">
<p><img src="../../_images/math/e63f03d9dbf00d7a9ba9e8f47d17267798a42b9c.svg" alt="n_h = \log _{2}n+1"/></p>
</div><p>The number of bins is the base 2 log of <code class="docutils literal notranslate"><span class="pre">a.size</span></code>, plus one.  This
estimator assumes normality of data and is too conservative for
larger, non-normal datasets. This is the default method of R’s
<code class="docutils literal notranslate"><span class="pre">hist</span></code> function.</p>
</dd>
<dt>‘doane’</dt><dd><div class="math">
<p><img src="../../_images/math/4c43e60e7971dd9f5fef9efa8db9c3eb0987800a.svg" alt="n_h = 1 + \log_{2}(n) +
            \log_{2}(1 + \frac{|g_1|}{\sigma_{g_1}})

g_1 = mean[(\frac{x - \mu}{\sigma})^3]

\sigma_{g_1} = \sqrt{\frac{6(n - 2)}{(n + 1)(n + 3)}}"/></p>
</div><p>An improved version of Sturges’ formula that produces better
estimates for non-normal datasets. This estimator attempts to
account for the skew of the data.</p>
</dd>
<dt>‘sqrt’</dt><dd><div class="math">
<p><img src="../../_images/math/3be23393bcaf22c3cdf72282e4b8e378122eaa2e.svg" alt="n_h = \sqrt n"/></p>
</div><p>The simplest and fastest estimator. Only takes into account the
data size.</p>
</dd>
</dl>
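<p>The formulas above can be evaluated directly and compared with the bin counts that <code class="docutils literal notranslate"><span class="pre">histogram_bin_edges</span></code> produces. The sketch below is a hand-written transcription of the equations, not the library's implementation; the random sample, the seed, and the names <code class="docutils literal notranslate"><span class="pre">widths</span></code>, <code class="docutils literal notranslate"><span class="pre">manual_counts</span></code> and <code class="docutils literal notranslate"><span class="pre">numpy_counts</span></code> are assumptions made for illustration.</p>

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for repeatability
a = rng.normal(size=500)         # illustrative float data
n = a.size
data_range = np.ptp(a)

# Quantities used by the variability-aware estimators.
q75, q25 = np.percentile(a, [75, 25])
iqr = q75 - q25
sigma = np.std(a)

# Doane's skewness term g_1 and its standard deviation sigma_{g_1}.
g1 = np.mean(((a - a.mean()) / sigma) ** 3)
sigma_g1 = np.sqrt(6.0 * (n - 2) / ((n + 1.0) * (n + 3)))
doane_count = 1.0 + np.log2(n) + np.log2(1.0 + abs(g1) / sigma_g1)

# Bin width h per estimator; count-based rules are recast via the data range.
widths = {
    'fd': 2.0 * iqr / n ** (1.0 / 3.0),
    'scott': sigma * (24.0 * np.sqrt(np.pi) / n) ** (1.0 / 3.0),
    'rice': data_range / (2.0 * n ** (1.0 / 3.0)),
    'sturges': data_range / (np.log2(n) + 1.0),
    'doane': data_range / doane_count,
    'sqrt': data_range / np.sqrt(n),
}
# 'auto' takes the larger bin count, i.e. the smaller of the two widths.
widths['auto'] = min(widths['fd'], widths['sturges'])

# Convert each width to a whole number of bins covering the data range.
manual_counts = {name: int(np.ceil(data_range / h)) for name, h in widths.items()}
numpy_counts = {name: len(np.histogram_bin_edges(a, bins=name)) - 1
                for name in widths}

for name in widths:
    print(name, manual_counts[name], numpy_counts[name])
```

<p>Printing both columns side by side makes it easy to see which estimators ask for more bins on a given dataset, for example that ‘sqrt’ and ‘rice’ depend only on the sample size.</p>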
<p class="rubric">Examples</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">arr</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">])</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">np</span><span class="o">.</span><span class="n">histogram_bin_edges</span><span class="p">(</span><span class="n">arr</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="s1">&#39;auto&#39;</span><span class="p">,</span> <span class="nb">range</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
<span class="go">array([0.  , 0.25, 0.5 , 0.75, 1.  ])</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">np</span><span class="o">.</span><span class="n">histogram_bin_edges</span><span class="p">(</span><span class="n">arr</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="go">array([0. , 2.5, 5. ])</span>
</pre></div>
</div>
<p>For consistency with histogram, an array of pre-computed bins is
passed through unmodified:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">np</span><span class="o">.</span><span class="n">histogram_bin_edges</span><span class="p">(</span><span class="n">arr</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">])</span>
<span class="go">array([1, 2])</span>
</pre></div>
</div>
<p>This function allows one set of bins to be computed, and reused across
multiple histograms:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">shared_bins</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">histogram_bin_edges</span><span class="p">(</span><span class="n">arr</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="s1">&#39;auto&#39;</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">shared_bins</span>
<span class="go">array([0., 1., 2., 3., 4., 5.])</span>
</pre></div>
</div>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">group_id</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">hist_0</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">histogram</span><span class="p">(</span><span class="n">arr</span><span class="p">[</span><span class="n">group_id</span> <span class="o">==</span> <span class="mi">0</span><span class="p">],</span> <span class="n">bins</span><span class="o">=</span><span class="n">shared_bins</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">hist_1</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">histogram</span><span class="p">(</span><span class="n">arr</span><span class="p">[</span><span class="n">group_id</span> <span class="o">==</span> <span class="mi">1</span><span class="p">],</span> <span class="n">bins</span><span class="o">=</span><span class="n">shared_bins</span><span class="p">)</span>
</pre></div>
</div>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">hist_0</span><span class="p">;</span> <span class="n">hist_1</span>
<span class="go">array([1, 1, 0, 1, 0])</span>
<span class="go">array([2, 0, 1, 1, 2])</span>
</pre></div>
</div>
<p>This gives more easily comparable results than using separate bins for
each histogram:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">hist_0</span><span class="p">,</span> <span class="n">bins_0</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">histogram</span><span class="p">(</span><span class="n">arr</span><span class="p">[</span><span class="n">group_id</span> <span class="o">==</span> <span class="mi">0</span><span class="p">],</span> <span class="n">bins</span><span class="o">=</span><span class="s1">&#39;auto&#39;</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">hist_1</span><span class="p">,</span> <span class="n">bins_1</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">histogram</span><span class="p">(</span><span class="n">arr</span><span class="p">[</span><span class="n">group_id</span> <span class="o">==</span> <span class="mi">1</span><span class="p">],</span> <span class="n">bins</span><span class="o">=</span><span class="s1">&#39;auto&#39;</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">hist_0</span><span class="p">;</span> <span class="n">hist_1</span>
<span class="go">array([1, 1, 1])</span>
<span class="go">array([2, 1, 1, 2])</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">bins_0</span><span class="p">;</span> <span class="n">bins_1</span>
<span class="go">array([0., 1., 2., 3.])</span>
<span class="go">array([0.  , 1.25, 2.5 , 3.75, 5.  ])</span>
</pre></div>
</div>
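<p>When the groups live in separate arrays, one possible approach is to estimate the edges from the pooled data so that every group falls inside the shared range. The following is a sketch under that assumption; the group data, the seed, and the names <code class="docutils literal notranslate"><span class="pre">group_a</span></code>, <code class="docutils literal notranslate"><span class="pre">group_b</span></code> and <code class="docutils literal notranslate"><span class="pre">shared_edges</span></code> are hypothetical.</p>

```python
import numpy as np

rng = np.random.default_rng(0)             # arbitrary seed
group_a = rng.normal(0.0, 1.0, size=200)   # illustrative data
group_b = rng.normal(2.0, 0.5, size=300)

# Estimate one set of edges from the pooled data so both groups are covered.
pooled = np.concatenate([group_a, group_b])
shared_edges = np.histogram_bin_edges(pooled, bins='auto')

# Reuse the same edges for every group.
counts_a, edges_a = np.histogram(group_a, bins=shared_edges)
counts_b, edges_b = np.histogram(group_b, bins=shared_edges)

print(counts_a.shape == counts_b.shape)
print(counts_a.sum(), counts_b.sum())
```

<p>Because the edges span the pooled minimum and maximum, no sample is dropped from either histogram, and the two count arrays can be compared bin by bin.</p>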
</dd></dl>

</div>


          </div>
        </div>
          </div>
        </div>
      </div>
    </div>

    <div class="container container-navbar-bottom">
      <div class="spc-navbar">
        
      </div>
    </div>
    <div class="container">
    <div class="footer">
    <div class="row-fluid">
    <ul class="inline pull-left">
      <li>
        &copy; Copyright 2008-2019, The SciPy community.
      </li>
      <li>
      Last updated on Feb 20, 2020.
      </li>
      <li>
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 2.4.2.
      </li>
    </ul>
    </div>
    </div>
    </div>
  </body>
</html>